Last-gen nostalgia: a lighthearted rant and reflection on genome sequencing culture

Author

  • David Roy Smith
Abstract

I sometimes see them in my dreams. The colorful peaks and troughs, the sharp, crisp waves spread across my computer screen, the rolling nitrogenous mountains, each with its own nucleotide sitting solidly on the summit. I’m talking about electropherograms, of course. Remember them? Those beautiful but oh so “oldgen” bioinformatics data generated from automated Sanger sequencing machines, such as the Applied Biosystems 370, the geriatric of genome sequencers. Don’t laugh. It was these capillary-based electrophoretic technologies that gave us the draft human genome sequence (Lander et al., 2001) and the genome maps of many other model organisms, from the bacterium Haemophilus influenzae to the yeast Saccharomyces cerevisiae to the multicellular green alga Volvox carteri (Fleischmann et al., 1995; Goffeau et al., 1996; Prochnik et al., 2010). As a grad student, I spent countless hours pruning, editing, assembling, and occasionally oohing and aahing over Sanger sequences (Sanger et al., 1977; Smith et al., 1986; Prober et al., 1987). These 800-nucleotide genetic snippets intrigued, inspired, and motivated me. They contained just enough data to pique my interest (a novel exon, strange repeat, or foreign gene) and always left me craving a bit more: one additional sequencing read to extend that PCR product, find that stop codon, or join those lonely contigs. Usually, it would take weeks or months to get that extra read, and when it arrived I would savor the experience, exploring and analyzing it like a new book from a favorite author. After I devoured the data, I would say to myself, “If only I could get my hands on a great number of sequencing reads from my organism of interest, then all of my genomic woes would be over.” Naively, I believed that the more sequencing data I had, the more productive I would be. Be careful what you wish for from the genome gods.
The onslaught of next-generation sequencing (NGS) technologies (Metzker, 2010; Koboldt et al., 2013) and the access to previously unfathomable amounts of genomic data have made me dizzy, disillusioned, and anything but efficient. Like the proverbial boiling frog, my mind is gradually overheating from an accumulation of NGS reads (Liu et al., 2012). It’s a paired-end nightmare, a SOLiD pain in the neck, and a massively parallel migraine. All this HiSeq and MiSeq is clogging up my internal drive and external disks. I’ve taken vacations and returned home only to find that my Illumina reads still haven’t finished downloading. I can’t move or back up a FASTQ file without needing a coffee break. Last month it got so bad that I tried calling 911 on my 454. I’m certain that I would have had two Nature papers by now if it weren’t for that pestering computer cursor that keeps spinning around and around, reminding me of my small memory and pitiful processing power. With all this NGS information, what have I gained (apart from being a chronic user of SEQanswers.com)? Well, I’m a co-investigator of half a dozen highly fragmented nuclear genome assemblies for various green algae, with no genome papers anywhere in sight. And don’t get me started on the number of transcriptome projects waiting to be written up. What’s worse is that I’m still sending more samples for sequencing. It’s become my default setting: when in doubt, sequence. If a colleague drops by my office and says, “Smitty, you interested in milkweeds?” my first response is, “You betcha. Let’s send some for sequencing.” Student asks: “Professor Smith, do you have any ideas for my honors thesis?” “Hmmm,” I say, “how about we sequence another green alga.” Grant money left over, what do I do? You guessed it: two-for-one RNA-seq at the campus sequencing facility. And if the data come back contaminated or the quality is poor? Easy, I sequence more!
It’s gotten to the point where I should begin my conference presentations with, “Hello, my name is David and I’m an NGS addict.” There are some positives to being NGS obsessed. I’m constantly testing and learning the newest bioinformatics software and genome assembly programs. I know all of the hippest genome slang and genetic acronyms. I have learned more than I ever wanted to about Linux, Unix, and Perl, although, as my students regularly point out, I’m still a hack in all three of those areas. I love that I can go to the Sequence Read Archive at the National Center for Biotechnology Information (Leinonen et al., 2011) (I visit the site incessantly) and in seconds access endless amounts of raw genomic and transcriptomic data from some of the coolest and most bizarre species on earth, and then use these data to mine genes for phylogenetic and other comparative analyses. I’m also an organelle genome junkie, and NGS techniques have made it




Journal title:

Volume 5, Issue

Pages -

Publication date: 2014